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The widespread popularity of replica exchange and expanded ensemble algorithms for simulating 
complex molecular systems in chemistry and biophysics has generated much interest in discovering 
new ways to enhance the phase space mixing of these protocols in order to improve sampling of 
uncorrelated configurations. Here, we demonstrate how both of these classes of algorithms can be 
considered as special cases of Gibbs sampling within a Markov chain Monte Carlo (MCMC) frame- 
work. Gibbs sampling is a well-studied scheme in the field of statistical inference in which different 
random variables are alternately updated from conditional distributions. While the update of the 
conformational degrees of freedom by Metropolis Monte Carlo or molecular dynamics unavoidably 
generates correlated samples, we show how judicious updating of the thermodynamic state indices — 
corresponding to thermodynamic parameters such as temperature or alchemical coupling variables — 
can substantially increase mixing while still sampling from the desired distributions. We show how 
state update methods in common use can lead to poor mixing, and present some simple, inexpen- 
sive alternatives that can increase mixing of the overall Markov chain, reducing simulation times 
necessary to obtain estimates of the desired precision. These improved schemes are demonstrated 
for several common applications, including an alchemical expanded ensemble simulation, parallel 
tempering, and multidimensional replica exchange umbrella sampling. 

Keywords: replica exchange simulation, expanded ensemble simulation, the method of expanded ensembles, 
parallel tempering, simulated scaling, generalized ensemble simulations, extended ensemble, Gibbs sampling, 
enhanced mixing, convergence rates, Markov chain Monte Carlo (MCMC), alchemical free energy calculations 



I. INTRODUCTION 

A broad category of simulation methodologies known 
as generalized ensemble [1] or extended ensemble [2] algo- 
rithms have enjoyed increasing popularity in the field 
of biomolecular simulation over the last decade. The 
two most popular algorithmic classes within this cat- 
egory are undoubtedly replica exchange, [3] which in- 
cludes parallel tempering [4—6] and Hamiltonian ex- 
change [7-10], among others, and its serial equivalent, 
the method of expanded ensembles [11], which includes 
simulated tempering [12, 13] and simulated scaling [14]. 
In both classes of algorithms, a mixture of thermo- 
dynamic states are sampled within the same simula- 
tion, with each simulation walker able to access multi- 
ple thermodynamic states through a stochastic hopping 
process, which we will generically refer to as consist- 
ing of swaps or exchanges. In expanded ensemble simula- 
tions, the states are explored via a biased random walk 
in state space; in replica exchange simulations, multi- 
ple coupled walks are carried out in parallel without bi- 
asing factors. Both methods allow estimation of equi- 
librium expectations at each state as well as free en- 
ergy differences between states. In both cases, stochas- 
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tic transitions between different thermodynamic states 
can reduce correlation times and increase sampling effi- 
ciency relative to straightforward Monte Carlo or molec- 
ular dynamics simulations by allowing the system avoid 
barriers between important configuration substates. 

Because of their popularity, these algorithms and their 
properties have been the subject of intense study over 
recent years. For example, given optimal weights, ex- 
panded ensemble simulations have been shown to have 
provably higher exchange acceptance rates than replica 
exchange simulations using the same set of thermody- 
namic states [15]. Higher exchange attempt frequen- 
cies have been demonstrated to improve mixing for 
replica exchange simulations [16, 17]. Alternative ve- 
locity rescaling schemes have been suggested to im- 
prove exchange probabilities [18]. Other work has ex- 
amined the degree to which replica exchange simu- 
lations enhance sampling relative to straightforward 
molecular dynamics simulations [19-25]. Numerous 
studies have examined the issue of how to optimally 
choose thermodynamic states to enhance sampling in 
systems with second-order phase transitions [26-32], 
though systems with strong first-order-like phase tran- 
sitions (such as two-state protein systems) remain chal- 
lenging [33, 34]. A number of combinations [35, 36] 
and elaborations [19, 37-40] of these algorithms have 
also been explored. A small number of publications 
have examined the mixing and convergence properties 
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of replica exchange and expanded ensemble algorithms 
with mathematical rigor [41-44], but there remain many 
unanswered questions about these sampling algorithms 
at a deep theoretical level. 

Standard practice for expanded ensemble and replica 
exchange simulations is that exchanges are to be at- 
tempted only between "neighboring" thermodynamic 
states — for example, the states with temperatures im- 
mediately above or below the current temperature in a 
simulated or parallel tempering simulation [4-10]. The 
rationale behind this choice is that states further away in 
state space will have low probability of acceptance due 
to diminished phase space overlap, and thus attempts 
should focus on the states for which exchange attempts 
are most likely to be accepted. Increasing the proxim- 
ity of neighboring thermodynamic states in both kinds 
of simulations can further increase the probability that 
exchange attempts will be accepted. However, restrict- 
ing exchange attempts to neighboring states can then re- 
sult in slow overall diffusion in state space due to the 
larger number of replicas needed to span the thermo- 
dynamic range of interest [45]. Some exchange schemes 
have been proposed to improve this diffusion process, 
such as all-pairs exchange [46], and optimized exchange 
moves [18] but the problem is still very much a chal- 
lenge (see Ref. [47] for a recent comparison). The prob- 
lem of slow diffusion is exacerbated in "multidimen- 
sional" simulations that make use of a 2D or 3D grid of 
thermodynamic states [7, 48, 49], where diffusion times 
in state space increase greatly due to the increase in di- 
mensionality [50]. 

Here, we show how the many varieties of expanded 
ensemble and replica exchange simulations can all be 
considered to be forms of Gibbs sampling, a sampling 
scheme well-known to the statistical inference literature 
[51, 52], though unrelated to simulations in the "Gibbs 
ensemble" for determining phase equilibria [53? , 54]. 
When viewed in this statistical context, a number of al- 
ternative schemes can readily be proposed for updating 
the thermodynamic state while preserving the distribu- 
tion of configurations and thermodynamic states sam- 
pled by the algorithm. By making simple modifications 
to the exchange attempt schemes, we show that great 
gains in sampling efficiency can be achieved under cer- 
tain conditions with little or no extra cost. There is es- 
sentially no drawback to implementing these algorith- 
mic improvements, as the additional computational cost 
is negligible, their implementation sufficiently simple 
to encourage widespread adoption, and there appears 
to be no hindrance of sampling in cases where these 
schemes offer no great efficiency gain. Importantly, we 
also demonstrate that schemes that encourage mixing 
in state space can also encourage mixing of the overall 
Markov chain, reducing correlation times in coordinate 
space, leading to more uncorrelated samples being gen- 
erated for a fixed amount of computer time. 

This paper is organized as follows. In Theory (Sec- 
tion II), we describe expanded ensemble and replica ex- 



change algorithms in a general way, casting them as a 
form of Gibbs sampling. In Algorithms (Section III), we 
propose multiple approaches to the state exchange pro- 
cess in both classes of algorithm with the aim of en- 
couraging faster mixing in among the thermodynamic 
states accessible in the simulation, and hence the overall 
Markov chain. In Illustration (Section IV), we illustrate 
how and why these modified schemes enhance mixing 
of the overall chain for a simple one-dimensional model 
system. In Applications (Section V), we apply these algo- 
rithmic variants to some examples from physical chem- 
istry, using several different common benchmark sys- 
tems from biomolecular simulation, and examine sev- 
eral metrics of simulation efficiency. Finally, we make 
recommendations for the adoption of simple algorith- 
mic variants that will improve efficiency in Discussion 
(Section VI). 



II. THEORY 

Before describing our suggested algorithmic modifi- 
cations (Algorithms, Section III), we first present some 
theoretical tools we will use to analyze expanded en- 
semble and replica exchange simulations in the context 
of Gibbs sampling. 



A. Thermodynamic states and thermodynamic ensembles 

To be as general as possible, we describe the expanded 
ensemble and replica exchange algorithms as sampling 
a mixture of K thermodynamic states. Here, a ther- 
modynamic state is parameterized by a vector of time- 
independent thermodynamic parameters A. For nota- 
tional convenience and to make what follows general, 
we define the reduced potential [55] u(x) of a physical sys- 
tem, 



u(x) — p 



H(x)+pV{x) 



Hirii(x) + ■ 



(1) 



corresponding to its thermodynamic state, where x de- 
notes the configuration of the system specifying any 
physical variables allowed to change, including the vol- 
ume V(x) (in the case of a constant pressure ensem- 
ble) and rii(x) the number of molecules of each of i — 
1 , . . . , M components of the system, in the case of a 
(semi)grand ensemble. The reduced potential is a func- 
tion of the Hamiltonian H, the inverse temperature /3 = 
(UbT)^ 1 , the pressure p, and the vector of chemical po- 
tentials for each of M components fii. Other thermody- 
namic parameters and their conjugate coordinates can 
be included in a similar manner, or some of these can 
be omitted, as required by the physics of the system. 
We denote the set of all thermodynamic parameters by 
A= {fi,H,p, ft,...}. 

We next denote a configuration of the molecular sys- 
tem by x G ft, where il is allowed configuration 
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space, which may be continuous or discrete. A choice 
of thermodynamic state gives rise to set of configura- 
tions of the system that are sampled by a given time- 
independent probability distribution at equilibrium. So 
each x will have associated unnormalized probability 
density q(x), which is a function of A, where q(x) > 
for all x G ft. If we define the normalization constant, or 
partition function, Z as: 

Z = I dxq(x) (2) 
Jn 

we can define a normalized probability density 

n(x) = Z' 1 q{x). (3) 

A physical system in equilibrium with its environ- 
ment obeying classical statistical mechanics will sample 
configurations distributed according to the Boltzmann 
distribution, 

q(x) = e~ u{x) . (4) 

In this paper, we consider a set of K thermodynamic 
states defined by their thermodynamic parameter vec- 
tors, A fe = {p k , H k ,Pk,Pk, ■ • ■}, with k=l,...,K, where 
Hk denotes any modifications of the Hamiltonian H as 
a function of k, including biasing potentials. Each new 
choice of k gives rise to a reduced potential Uk, unnor- 
malized and normalized probability distributions qk (x) 
and ir(x, k), and a partition function Zk- Although in 
this paper, we generally assume a Boltzmann distribu- 
tion, there is nothing to prevent some or all of the states 
from being sampled using non-thermodynamic (non- 
Boltzmann) statistics using alternative choices of the un- 
normalized density qk(x), as in the case of multicanon- 
ical simulations [56] or Tsallis statistics [57]. To ensure 
that any configuration x has finite, nonzero density in all 
K thermodynamic states, we additionally require that 
the same thermodynamic parameters be specified for 
all thermodynamic states, though their values may of 
course differ. 



B. Gibbs sampling 



Assume we can draw samples, either independently 
or through some Markov chain Monte Carlo procedure, 
from the conditional distributions of one or more of the 
variables, n(x\y) or ir(y\x), where the value of the sec- 
ond variable is fixed. To generate a set of sample pairs 
{(x^, j/ 1 )), {x^ 2 \ i/ 2 '), . . .} from 7r(x, y), we can iterate 
the update scheme: 

x^+^yW ~ir(x\y( n) ) 

y(n+D\ x (n+l) „ ^(^(n+l)) 

where x ~ ir denotes that the random variable x is sam- 
pled or "updated") from the distribution ir(x). 

This procedure is termed Gibbs sampling or the Gibbs 
sampler in the statistical literature, and has been em- 
ployed and studied extensively [51, 52]. In many cases, 
it may be possible to draw uncorrelated samples from 
either or both distributions, but this is not required [? 
]. The choice of which variable to update — in this ex- 
ample, x or y — can be either deterministic (e.g. update x 
then y) or stochastic (e.g. a random number determines 
whether x or y is to be updated); both schemes sam- 
ple from the desired joint distribution ir(x, y). However, 
each method has different dynamic properties and can 
introduce different correlation structure in the sequence 
of sample pairs. In particular, we note that a stochas- 
tic choice of which variable to update obeys detailed 
balance, while a deterministic choice obeys the weaker 
balance condition [63]. In both cases, the distribution 
7r(x, y) is preserved. 

In the sections below, we describe how expanded en- 
semble and replica exchange simulations can be consid- 
ered as special cases of Gibbs sampling on the proba- 
bility distribution tt(x, k), which is now a function of 
both coordinates and thermodynamic states, and how 
this recognition allows us to consider simple variations 
of these techniques that will enhance mixing in phase 
space with little or no extra cost. In the algorithms we 
consider here, the thermodynamic state variable k is dis- 
crete, but continuous k are also completely valid in this 
formalism, v 



Suppose we wish to sample from the joint distribu- 
tion of two random variables, x and y. We denote this 
joint distribution by ir(x, y). Often, it is not possible to 
directly generate uncorrelated sample pairs (x,y) from 
the joint distribution due to the complexity of the func- 
tion ir(x,y). In these cases, a standard approach to 
sampling is to use some form of Markov chain Monte 
Carlo (MCMC) [52], such as the Metropolis-Hastings 
algorithm [58, 59] or hybrid Monte Carlo [60]. While 
general in their applicability, MCMC algorithms suffer 
from the drawback that they often must generate corre- 
lated samples, potentially requiring long running times 
to produce a sufficient number of effectively uncorre- 
lated samples to allow the computation of properties to 
the desired precision [61, 62]. 



C. Expanded ensembles 

In an expanded ensemble simulation [11], a single 
replica or "walker") samples pairs (x,k) from a joint 
distribution of configurations x G V and state indices 
ke{l,...,K] given by, 

7r(x, k) oc exp[-w fc (a;) + g k ], (5) 

where gk is an state-dependent weighting factor. This 
space is therefore a mixed, generalized, or expanded en- 
semble which samples from multiple thermodynamic 
ensembles simultaneously, gk is chosen to give a specific 
weighting of each subensemble in the expanded ensem- 
ble, and is generally determined through some iterative 
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procedure [11, 12, 14, 31, 64-66]. The set of g k is fre- 
quently chosen to give each thermodynamic ensemble 
equal probability, in which case g k = — In Z k , but they 
can be set to arbitrary values as desired. 

In the context of Gibbs sampling, an expanded ensem- 
ble simulation proceeds by alternating between sam- 
pling from the two conditional distributions, 



Tr(x\k) 



7r(fc|x) 



-u k (x) 



f Q dxq k (x) J n dxe- U ^ 



e 9k q k (x) 



A A 
k' = l fe'=l 



(6) 



(7) 



In all but trivial cases, sampling from the conditional 
distribution 7r(a;|fc) must be done using some form of 
Markov chain Monte Carlo sampler that generates cor- 
related samples, due to the complex form of u k {x) and 
the difficulty of computing the normalizing constant in 
the denominator [52]. Typically, Metropolis-Hastings 
Monte Carlo [58, 59] or molecular dynamics is used [? ], 
generating an updated configuration that is cor- 

related with the previous configuration x^ n \ However, 
as we will see in Algorithms (Section III A), multiple 
choices for sampling from the conditional distribution 
n(k\x) are possible due to the simplicity of its form. 



D. Replica exchange ensembles 

In a replica exchange, we consider K simulations, 
with one simulation in each of the thermodynamic K 
states. The current state of the replica exchange sim- 
ulation is given by (X, S), where X is a vector of the 
replica configurations, X = {xi,x%, . . . ,xk}, and S = 
{si, . . . , sk} € Sk is a permutation of the state indices 
S = {1, . . . , K} associated with each of the replica con- 
figurations X = {xi, . . . , ijf}. Then: 



A" 



ir(X, S) (x q St (x^ cx exp 



A' 



■ u H (xj) 



(8) 



with the conditional densities therefore given by 



A 



n(X\S)=H 



<S\X) = 



e -u H (xi) 

J n dxe-^M 



exp 


A 

- u Si (xi) 

. i=l 








A 




Yj eX P 


- J2 u s'\Xi) 


S'es K 









(9) 



(10) 



As in the case of expanded ensemble simulations, up- 
dating of configurations X must be by some form 
Markov chain Monte Carlo or molecular dynamics sim- 
ulation, invariably generating configurations with some 



degree of correlation. Unlike the case of expanded en- 
sembles, generating independent samples in the condi- 
tional permutation space is very challenging for even 
moderate numbers of states because of the expense of 
computing the denominator of ir(S\X) [? ], which in- 
cludes a sum over all permutations in the set Sk ■ How- 
ever, as we shall see in Section III B, there are still effec- 
tive ways to generate nearly uncorrelated permutations 
that have improved mixing properties over traditional 
exchange attempt schemes. 



III. ALGORITHMS 

We now describe the algorithms used in sampling from 
the expanded ensemble and replica exchange ensembles 
described in Theory (Section II). We start with the typical 
neighbor exchange schemes commonly used in the liter- 
ature, and then describe additional novel schemes based 
on Gibbs sampling that can encourage more rapid mix- 
ing among the accessible thermodynamic states. 



A. Expanded ensemble simulation 

For an expanded ensemble simulation, the conditional 
distribution of the state index k given x is, again: 

,9k~ Uk(x) 



n(k\x) = K 



k' = l 



We can use any proposal/acceptance scheme that en- 
sures this conditional distribution is sampled in the long 
run for any fixed x. We can choose at each step to sample 
in k or x depending according to some fixed probabil- 
ity p, in which case detailed balance is obeyed. We can 
also alternate N k and N x steps of k and x sampling, re- 
spectively. Although this algorithm does not satisfy de- 
tailed balance, it does satisfy the weaker condition of bal- 
ance [63] which is sufficient to preserve sampling from 
the joint stationary distribution ir(x, k). In the cases that 
proposal probabilities are based on past history how- 
ever, the algorithm will not preserve the equilibrium 
distribution [67]). 



1. Neighbor exchange 

In the neighbor exchange scheme, the proposed state 
index j given the current state index i is chosen ran- 
domly from one of the neighboring states, i ± 1, with 
probability, 



a(j\x,i) 




(11) 
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and accepted with probability, 



P.. 



accept 



■ fi e s 3~"J (x) 1 



if 3 i {1, 
else 



(12) 



This scheme was originally suggested by Marinari and 
Parisi [12] and has been used extensively in subsequent 
work [35, 68]. A slight variation of this scheme considers 
the set {1, . . . , K} to lie on a torus, such that state i + nK 
is equivalent to state i for integral n, with the proposal 
and acceptance probability otherwise left unchanged. 

An alternative scheme avoids having to reject choices 
of j that lead to j ^ {1, . . . , K} by modifying the pro- 
posal scheme, 



, if ke{2,...,K-l},\j-i\ =1 

1 Xi = l,j = i + 1<K 

1 if i = K, j = i - 1 > 1 

else 



(13) 



and modifying the acceptance criteria for these two 
moves to be [35] 

U\ x ,i)= T foH,-—-^K (14) 



p. 



accept 



to include the correct Metropolis-Hastings ratio of pro- 
posal probabilities. 



2. Independence sampling 

The most straightforward way of generating an un- 
correlated state index i from the conditional distribution 
7r(fc|z) is by independence sampling, in which we propose 
an update of the state index i by drawing a new j from 
7r(i|x) with probability 



<x{j\x,i) = Tv(i\x) 



(15) 



and always accepting this new j. While well-known in 
the statistical inference literature [52] — and the update 
scheme most closely associated with the use of the Gibbs 
sampler there — this scheme has been recently discov- 
ered independently in the context of molecular simula- 
tion [25]. A straightforward way to implement this up- 
date scheme is to generate a uniform random number 
r on the interval [0, 1), and select the smallest k where 



and accepted with probability, 



. 1 — 7r(i|x, i) 
P —^ X ' l)=mm \ 1 'l-n(j\x,i) 



(17) 



This scheme has the surprisingly property that, despite 
including a rejection step (unlike the independence sam- 
pling in Section III A 2 above), the mixing rate in 7r(fc|x) 
can be proven to be greater than that of independence 
sampling [69], using the same arguments that Peskun 
used to demonstrate the optimality of the Metropolis- 
Hastings criteria over other criteria for swaps between 
two states. This can be rationalized by noting that 
Metropolized independence sampling updates will al- 
ways try move away from the current state, whereas 
standard independence sampling has some nonzero 
probability to propose to remain in the current state. 



4. Restricted range sampling 

In some situations, such as simulated scaling [14] 
or other schemes in which the Hamiltonian differs in 
a non-trivial way among thermodynamic states, there 
may be a non-negligible cost in evaluating the unnor- 
malized probability distributions <j% (x) for all k. Because 
transitions to a states with minimal phase space over- 
lap will have very low probability, prior knowledge of 
which states have the highest phase space overlap could 
reduce computational effort with little loss in sampling 
efficiency if states with poor overlap are excluded from 
consideration for exchange. 

One straightforward way to implement such a re- 
stricted range sampling scheme is to define a set of pro- 
posal states Si for each state i e {1, . . . , K}, with the 
requirement that i € Sj if and only if j 6 Si, and pro- 
pose transitions from the current (ar, i) to a new state j 
with probability, 



e gj-"jO> 
Ot{j\X,l) = < kts t 



3&<Si 
3 i Si 



(18) 



This proposal is accepted with probability, 



3. Metropolized independence sampling 

In what we term a Metropolized independence sampling 
move [69], a new state index k' is proposed from the 
distribution, 



a(j\x,i) = I i-'WM) JT 
j = i 



(16) 



^accept 01 M) = min 



/ ^ e 9k ~ Uk ^ 
- fcgSi 

' J2 eSk'-9k'( x ) 
\ k'eSj 



(19) 



We can easily see that this scheme satisfies detailed 
balance for fixed x. The probability the sampler is ini- 
tially in i £ Sj and transitions to j G Si, where j ^ i, is 
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given by, 

ir(i\x)a{j\x,i)P accept (j\x,i) 



,9i-Ui(x) 



Z{S, 



all) 



egj -Uj(x) 



Z{Si) 

e g j -u j (x) e g i -u i (x) 
e gj-Uj(x) 



min 1 



Z(Si) 



(20) 



Z(Sj) 

min(Z- 1 (5 4 ),^ 1 (5,))](21) 



~zjs~y 



min 1 



Z(Sj) 
Z(Si) 



7T ( J» O. ( i | X , j ) Paccept ( 1 \ X , j 



(22) 
(23) 



where Z(Si) - £ feeS< e^-«*(*), and 5 aU = {1, . . .,#}• 
This is simply the detailed balance condition, ensur- 
ing that this scheme will sample from the distribution 
ir(i\x). Therefore, this scheme samples from the station- 
ary probability ir(j\x). 

For example, we can define Si = {i — n, . . . , i + n}, 
with n <C K, for all i, making appropriate adjustments 
to this range at i < n and i > K — n. Then we 
only need to compute the reduced potentials for states 
{min(l, i — 2n), . . . , max(K, i + 2n)}, rather than all states 
{1, . . . , K}. The additional evaluations for {min(l, i — 
2n), . . . , i — n — 1} and {max(i + ri + 1, K) . . . , max(i^, i + 
2n)} are required to ensure that we can calculate both 
sums in the acceptance criteria (Eq. 19). 

Restricted range sampling simply reduces to indepen- 
dence sampling, as presented in Section III A 2, when 
Si = {1, ...,K}, and all proposals are therefore ac- 
cepted. We also note that Metropolized independence 
sampling, in Section III A 3 is exactly equivalent to us- 
ing the restricted range scheme with Si = {1, . . . , K} 
excluding i, such that a(i\x,i) = for all i. Any other 
valid scheme of sets Si can be Metropolized by remov- 
ing i from Si . 

Clearly, other state decomposition schemes exist, 
though the efficiency of such schemes will almost cer- 
tainly depend on the underlying nature of the thermo- 
dynamic states under study. It is possible to define 
state schemes that preserve detailed balance, but that 
are not ergodic, such as Si = S3 = S$ = {1,3,5} 
and S 2 — 5 4 = S 6 = {2,4,6} for K — 6, so some 
care must be taken. In most cases, users will likely use 
straightforward rules to find locally defined sets such 
as Si — {i — n, . . . , i + n} or the Metropolized version 
Si = {i — n, . . . , i — 1, i + 1, . . . , i + n}, and ergodicity as 
well as detailed balance will be satisfied. Further anal- 
ysis of the performance tradeoffs involved in choices of 
the sets, situations where sets might be chosen stochas- 
tically, or more efficient choices of sets that satisfy only 
balance is beyond the scope of this study. 



5. Other schemes 

The list above is by no means intended to be 
exhaustive — many other schemes can be used for updat- 
ing the state index k, provided they sample from 7r(fc|x). 



Compositions of different schemes are also permitted — 
even something simple as application of the neighbor 
exchange scheme a number of times, rather than just 
once, could potentially improve mixing properties at lit- 
tle or no additional computational cost. 



B. Replica exchange simulation 

1. Neighbor exchange 

In standard replica exchange simulation algorithms, 
an update of the state permutation S of the (X, S) 
sampler state only considers exchanges between neigh- 
boring states [4-10]. One such scheme involves at- 
tempting to exchange either the set of state index 
pairs {(1, 2), (3, 4), . . .} or {(2, 3), (4, 5), . . .}, chosen with 
equal probability [? ]. 

Each state index pair exchange attempt is at- 

tempted independently, with the exchange of states i 
and j associated with configurations Xi and Xj, respec- 
tively, accepted with probability 



-Paccept (xi, i, Xj,j) = min <^ 1 



3 -[Ui(Xj) + Uj(Xi)] 

-[u i {x i )+u j (x j)} 



(24) 



2. Independence sampling 

Independence sampling in replica exchange would 
consist of generating an uncorrelated, independent sam- 
ple from ir(S\X). The most straightforward scheme for 
doing so would require compiling a list of all possible K\ 
permutations of S, evaluating the unnormalized proba- 
bility exp [— u Si (xi)] for each, normalizing by their 
sum, and then selecting a permutation 5" according to 
this normalized probability. Even if the entire K x K 
matrix U = (liy) with Uij = Ui(xA is precomputed, the 
cost of this sampling scheme becomes impractical even 
for modestly large K. 

Instead, we note that an effectively uncorrelated sam- 
ple from ir(S\X) can be generated by running an MCMC 
sampler scheme for a short time with trivial or small ad- 
ditional computational expense. For each step of the 
MCMC sampler, we pick a pair of state indices 
with i 7^ j, uniformly from the set {1, . . . , K}. The state 
pair associated with the configurations Xi and Xj are 
swapped with the same replica exchange Metropolis- 
like criteria shown in Eq. 24, with the labels of the states 
updated after each swap. If we precompute the matrix 
U, then these updates are extremely inexpensive, and 
many Monte Carlo update steps of the state permuta- 
tion vector S can be taken to decorrelate from the pre- 
vious sample for a fixed set of configurations X, effec- 
tively generating an uncorrelated sample S' ~ P(S\X). 

In the case where all « !3 are equal, then the number of 
swaps required is K In K — a well-known result due to 
Aldous and Diaconis [70]. Empirically, we have found 
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that swapping K 3 to K 5 times each state update itera- 
tion appears to be sufficient for the molecular cases ex- 
amined in this paper and in our own work without con- 
suming a significant portion of the replica exchange iter- 
ation time, but further experimentation may be required 
for some systems. Instead of performing random pair 
selections, we could also apply multiple passes of the 
neighbor exchange algorithm (Section III B 1 ) . We note 
that complete mixing in state space is not a requirement 
for validity of the algorithm, but increasing the number 
of swaps will lead to increased space sampling until the 
limit of independent sampling is reached. 

The method of multiple consecutive state swaps be- 
tween configuration sampling is not entirely novel — we 
have heard several anecdotal examples of people exper- 
imenting with multiple consecutive state swaps, with 
sparse mentions in the literature [71, 72]. However, we 
believe this is the first study to characterize the theory 
and properties of this particular modification of stan- 
dard replica exchange. 

For parallel tempering, in which only the inverse tem- 
perature /3k varies with state index k, computation of 
U is trivial if the potential energies of all K states are 
known. On the other hand, computation of all Ui(xj) 
for all i, j = 1, . . . , K may be time-consuming if the po- 
tential energy must be recomputed for each state, such 
as in an alchemical simulation. If the Bennett accep- 
tance ratio (BAR) [73] or the improved multistate ver- 
sion MBAR [55] are used to analyze data generated dur- 
ing the simulation, however, all such energies are re- 
quired anyway, and so no extra work is needed if the 
state update interval matches the interval at which en- 
ergies are written to disk. Alternatively, if the number 
of Monte Carlo or molecular dynamics time steps in be- 
tween each state update is large compared to K, the 
overall impact on simulation time of the need to com- 
pute U will be minimal. 



3. Other schemes 

The list of replica exchange methods above is by no 
means exhaustive — other schemes can be used for up- 
dating the state index k, provided they sample from the 
space of permutations ir(S\X) in a way that preserves 
the conditional distribution. For example, it may be effi- 
cient for a node of a parallel computer to perform many 
exchanges only among replicas held in local memory, 
and to attempt few exchanges between nodes due to net- 
work constraints. Compositions of different schemes are 
again also permitted. 



C. Metrics of efficiency 

There is currently no universally accepted metric for 
assessing sampling efficiency in molecular simulation, 
and thus it is difficult to quantify exactly how much 



our proposed algorithmic modifications improve sam- 
pling efficiency. In the end, efficient algorithms will de- 
crease the computational effort to achieve an estimate of 
the desired statistical precision for the expectations or 
free energy differences of interest. Unfortunately, this 
can depend strongly on property of interest, the ther- 
modynamic states that are being sampled, and the dy- 
namics of the system studied. While there exist met- 
rics that describe the worst case convergence behavior 
by approximating the slowest eigenvalue of the Markov 
chain [74, 75], the worst case behavior can often dif- 
fer from practical behavior by orders of magnitude [76]. 
Here, we make use of a few metrics that will help us un- 
derstand the time scale of these correlations in sampling 
under practical conditions. 

Complex systems often get stuck in metastable states 
in configuration space with residence times a substan- 
tial fraction of the total available simulation time. This 
dynamical behavior hinders the sampling of uncor- 
related configurations by molecular dynamics simula- 
tion or Metropolis Monte Carlo schemes [77, 78]. Sys- 
tems can remain stuck in these metastable traps even 
as a replica in an expanded ensemble or replica ex- 
change simulation travels through multiple thermody- 
namic states [79], either because the trap exists in mul- 
tiple thermodynamic states or because the system does 
not have enough time to escape the trap before return- 
ing to states where the trap exists. While approaches 
for detecting and characterizing the kinetics of these 
metastable states exist [79, 80], the combination of con- 
formation space discretization error and statistical error 
makes the use of these approaches to compute relax- 
ation times in configuration space not ideal for our pur- 
poses. 

Here, we instead consider three simple statistics 
of the observed state index of each replica trajec- 
tory as surrogates to assess the improvements in 
overall efficiency of sampling. Instead of consider- 
ing the full expanded ensemble simulation trajectory 
{(a;*- ), fc(°)), (a;W , fcD), . . .} or the replica exchange sim- 
ulation trajectory {(X^°\ S^), (X^\S^), . . .}, we con- 
sider the trajectory of individual replicas projected 
onto the sequence of thermodynamic state indices s = 
{sq, Si, . . .} visited during the simulation. In long replica 
exchange simulations, each replica executes an equiv- 
alent random walk, and statistics can be pooled [81]. 
If significant metastabilities in configuration space ex- 
ist, we hypothesize that these configurational states will 
have different typical reduced potential u(x) distribu- 
tions, and therefore induce metastabilities in the state 
index trajectory s as well that will be detectable by the 
methods described below. Each of the measures pro- 
vides a different way to interpret the mixing of the sim- 
ulation in state space; we will refer to all of them in the 
rest of the paper as "mixing times." 
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1. Relaxation time from empirical state transition matrix, ti 



2. Correlation time of the replica state index, r a 



One way to characterize how rapidly the simulation 
is mixing in state space is to examine the empirical transi- 
tion matrix among states, the K x K row-stochastic ma- 
trix T. An individual element of this matrix, Ty, is the 
probability that an expanded ensembles or replica ex- 
change walker currently in state i will be found in state 
j the next iteration. From a given expanded ensemble 
or replica exchange simulation, we can estimate T by 
examining the expanded ensemble trajectory history or 
pooled statistics from individual replicas, 



T, 



Ni 



« K 
k=l 



(25) 



where Nij is the number of times the replica is ob- 
served to be in state k one update interval after being 
in state i. To obtain a transition matrix T with purely 
real eigenvalues, we have assumed both forward and 
time-reversed transitions in state indexes are equally 
probable, which is true in the limit of infinite time for 
all methods described in this paper. To assess how 
quickly the simulation is transitioning between differ- 
ent thermodynamic states, we compute the eigenvalues 
{(iiffjQ, . . . , /j,k} of T and sort them in descending or- 
der, such that that 1 = > /x 2 > ■ • • > Hk- If M2 = 
1, the Markov chain is decomposable, meaning that two 
more subsets of the thermodynamic states exist where 
no transitions have been observed between these sets, 
a clear indicator of very poor mixing in the simulation. 
In this case, the thermodynamic states characterized by 
{Ai, . . . , Xk} should be adjusted, or additional thermo- 
dynamic states inserted to enhance overlap in problem- 
atic regions. Several schemes for optimizing the choice 
of these state vectors exist [26-32], but are beyond the 
scope of this work to discuss here. 

If the second-largest eigenvalue //2 is such that < 
[i2 < 1 we can estimate a corresponding relaxation time 
t 2 as 



T-2 



1 - A*2 



(26) 



where r is the effective time between exchange attempts. 
t-2 then provides an estimate of the total simulation time 
required for the autocorrelation function in the state in- 
dex few of a replica at iteration n of the simulation to de- 
cay to 1/e of the initial value. This estimate holds if the 
time scale of decorrelation in the configurational coordi- 
nate x is fast compared to the decorrelation of the state 
index k; that is, if essentially uncorrelated samples could 
be drawn from n(x\k) for each update of x^ n+1>> \k^ n \ Be- 
cause configuration updates for useful molecular prob- 
lems generally have long correlation times, this r 2 time 
represents a lower bound on the observed correlation 
time for both the state index k^ n ' and the configuration 

r (n) 



As a more realistic estimate of how quickly correla- 
tions in the state index k^ n ' decay in a replica trajectory, 
we also directly compute the correlation time of the state 
index history using the efficiency computation scheme 
described in Section 5.2 of [81], where r ac is equal to the 
integrated area under the autocorrelation function. For 
replica exchange simulations, where all replicas execute 
an equivalent walk in state space, the unnormalized au- 
tocorrelation functions were averaged over all replicas 
before computing the autocorrelation time by integrat- 
ing the area under the autocorrelation function. This 
time, T ac , gives a practical estimate of how much sim- 
ulation time must elapse for correlations in the state in- 
dex to fall to 1/e. The statistical inefficiency is the number 
of samples required to collect each uncorrelated sam- 
ple, and can be estimated for a Markovian process by 
2r ac + 1, with r ac in units of time between samples. 



3. Average end-to-end transit time of the replica state index, r cnd 

As an additional estimate of practical efficiency, we 
measure the average end-to-end transition time for the 
state index, r en( j. This is the average of the time elapsed 
between the first visit of the state index k^ to one 
end point (k = 1 or k = K) after visiting the op- 
posite end point (k = K or k = 1, respectively). 
This metric of efficiency, or the related "round-trip" 
time, has seen common use in diagnosing efficiency 
for simulated-tempering and replica exchange simula- 
tions [18, 28, 82, 83]. 



IV. MODEL ILLUSTRATION 

To illustrate the motivation behind the idea that 
speeding up sampling in one coordinate — the state in- 
dex or permutation — will enhance sampling of the over- 
all Markov chain of (x, k) or (X, P), we consider a simu- 
lated tempering simulation in a one-dimensional model 
potential, 



U{x) = 10(ar- lf{x + \f 



(27) 



shown in the top panel of Figure 1, along with the corre- 
sponding stationary distribution ir(x) at several temper- 
atures from ksT = 1 to fcsT = 10. To simplify our illus- 
tration, we directly numerically compute the log-weight 
factors 



9k 



In 



dxe-^ u ^ 



(28) 



so that the simulation has an equal probability to be in 
each of the K states. 

The K inverse temperatures that can be visited 
during the simulated tempering simulation are chosen 
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FIG. 2. Autocorrelation times as function of number of tem- 
peratures for one-dimensional model. The integrated auto- 
correlation time r fe for state index k (top) and r x for position x 
(bottom) as a function of the number of exponentially-spaced 
temperatures spanning the range k B T £ [1, 10]. The correla- 
tion times for neighbor swap (black points) and independence 
sampling updates (red stars) are shown for each. Error bars 
represent one standard error of the mean. 




FIG. 1. Simulated tempering for a one-dimensional model. 

Top panel: Potential energy U(x) and stationary probabilities 
ir(x) for one-dimensional two-well model potential at 16 tem- 
peratures spanning k B T = 1, where barrier crossing is hin- 
dered, to k B T = 10, where barrier crossing is rapid. Middle 
panel: Temperature index, k, and position, x, histories for a 
simulated tempering simulation where neighbor swap in tem- 
perature are attempted each iteration. Bottom panel: Temper- 
ature index, k, and position, x, histories for a simulated tem- 
pering simulation where independence sampling of the tem- 
perature index is performed each iteration. Only the first 5 000 
iterations are shown, though simulations of 10 6 iterations were 
conducted to estimate the correlation times Tk and t x printed 
above each panel, shown in number of iterations required to 
produce an effectively uncorrelated sample in either k or x, re- 
spectively. Statistical uncertainties shown represent one stan- 
dard error of the mean. 
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to be geometrically spaced, 

k = i -(k-i)/(K-i) i ov k = l,... 1 K (29) 

Each iteration of the simulation consists of an update 
of the temperature index fc using either neighbor ex- 
change (Section IIIA1) or independence sampling up- 
dates (Section III A 2), followed by 100 steps of Metropo- 
lis Monte Carlo [58, 59] using a Gaussian proposal 
with zero mean and standard deviation of 0.1 in the 
x-coordinate. Simulations are initiated from (xo,fco) = 

(-1.1). 

Illustrative trajectories for K = 16 are shown in the 
second and third panels of Figure 1, along with the cor- 
relation times Tfc and t x computed for the temperature 
index k and the configurational coordinate x, respec- 
tively, from a long trajectory of 10 6 iterations. Indepen- 
dence sampling in state space k greatly reduces the cor- 
relation time, and hence statistical inefficiency, in k com- 
pared to neighbor sampling. Importantly, because k and 
x are coupled, we clearly see that increasing the mixing 
in the index k also substantially reduces the correlation 
time in the configurational coordinate x. We find that 
t x — 9.6 ± 0.2 for independence sampling, compared to 
24.1 ± 0.9 for neighbor moves. 

Figure 2 compares the correlation times for k and x 
estimated from simulations of length 10 6 for different 
numbers of temperatures spanning the same range of 
ksT e [1, 10], with temperatures again geometrically 
spaced according to Eq. 29. As the number of tempera- 
tures spanning this range increases, the correlation time 
in the temperature coordinate k increases, as one would 
expect for a random walk on domains of increasing size. 
Notably, increasing the number of temperatures also has 
the effect of increasing the correlation time of the con- 
figuration coordinate x. When independence sampling 
is used to update the temperature index k instead, the 
mixing time in k is greatly reduced, and both correla- 
tion times Tfc and r x remain small even as the number of 
temperatures is increased. 

V. APPLICATIONS 

To demonstrate that the simple state update modifica- 
tions we describe in Section III lead to real efficiency im- 
provements in practical simulation problems, we con- 
sider three typical simulation problems: An alchemi- 
cal expanded ensemble simulation of united atom (UA) 
methane in water to compute the free energy of trans- 
fer from gas to water; a parallel tempering simulation 
of terminally-blocked alanine dipeptide in implicit sol- 
vent; and a two-dimensional replica exchange umbrella 
sampling simulation of alanine dipeptide in implicit sol- 
vent to compute the potential of mean force. These 
systems are small compared to modern applications of 
biophysical and biochemical interest. However, they 
are realistic enough to demonstrate the fundamental is- 
sues in multiensemble simulations, but still sufficiently 



tractable that a large quantity of data can be collected to 
prove that the differences in efficiency of our proposed 
mixing schemes are highly significant. 



A. Expanded ensemble alchemical simulations of 
Lennard-Jones spheres in water 

1. United atom methane 

We first compare different types of Gibbs sampling 
state space updates in an expanded ensemble alchemical 
simulation of the kind commonly used to compute the 
free energy of hydration of small molecules [35, 82]. If 
the state mixing schemes proposed here lead to more ef- 
ficient sampling among alchemical states, a larger num- 
ber of effectively uncorrelated samples will be generated 
for a simulation of a given duration, and thus require 
less computation effort to reach the desired degree of 
statistical precision. 

An OPLS-UA united atom methane particle {a = 
0.373 nm, e = 1.230096 kj/mol) was solvated in a cubic 
simulation cell containing 893 TIP3P [84] waters. For all 
simulations, a modified version of GROMACS 4.5.2 [85] 
was used [? ]. A velocity Verlet integrator [86] was 
used to propagate dynamics with a timestep of 2 fs. A 
Nose-Hoover chain of length 10 [87] and time constant 
tt = 10.0 ps was used to thermostat the system to 298 
K. A measure-preserving barostat was used according to 
Tuckerman et al. [88, 89] to maintain the average system 
pressure at 1 atm, with t p = 10.0 ps and compressibility 
4.5 x 10~ 5 bar -1 . Rigid geometry was maintained for 
all waters using the analytical SETTLE scheme [90]. A 
neighbor list and PME cutoff of 0.9 nm were used, with 
a PME order of 6, spacing of 0.1 nm and a relative toler- 
ance of 10~ 6 at the cutoff. The Lennard-Jones potential 
was switched off, with the switch beginning at 0.85 nm 
and terminating at the cutoff of 0.9 nm. An analytical 
dispersion correction was applied beyond the Lennard- 
Jones cutoff to correct the energy and pressure computa- 
tion [91]. The neighbor list was updated every 10 steps. 

A set of K = 6 alchemically-modified thermodynamic 
states were used in which the Lennard-Jones interac- 
tions between the methane and solvent were eliminated 
using a soft-core Lennard-Jones potential [92], 

U ij (r;\)=4e ij \f(r;\)[l-f(r;\)] 
f(r;\) = la(l-\) + (r/a lJ ) 6 }- 1 (30) 

with values of the alchemical coupling parameter 
chosen to be {0.0, 0.3, 0.6, 0.7, 0.8, 1.0}. 

To simplify our analysis of efficiency, we fix the log- 
weights gk to "perfect weights," where all states are vis- 
ited with equal probability. This also decouples the issue 
of efficiency of state updates with efficiency of different 
weight update schemes, of which many have been pro- 
posed [11, 12, 14, 31, 64-66]. The "perfect" log-weights 
were estimated for this system as follows: A 1 ns ex- 
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mixing times (ps) relative speedup 

T2 Tac T cnd TN T2 Tac T cnd TN 

1 state move attempted every 0.1 ps 

neighbor exchange 1.693 ± 0.008 6.7 ± 0.4 11.9 ± 0.2 5.9 ± 0.4 L0 L0 L0 L0 

independence sampling 0.771 ± 0.004 6.2 ± 0.2 7.2 ± 0.1 5.4 ± 0.2 2.20 ± 0.02 1.08 ± 0.08 1.65 ± 0.04 1.10 ± 0.09 

Metropolized indep. 0.645 ± 0.003 4.6 ± 0.2 6.6 ±0.1 4.4 ± 0.2 2.62 ± 0.02 1.5 ± 0.1 1.81 ± 0.04 1.3 ± 0.1 

1 000 state moves attempted every 0.1 ps 

neighbor exchange 0.764 ± 0.006 4.9 ± 0.3 7.2 ± 0.1 4.6 ± 0.3 1.0 1.0 1.0 1.0 

independence sampling 0.769 ± 0.005 4.8 ± 0.2 7.2 ±0.1 4.5 ± 0.2 0.99 ± 0.01 1.01 ± 0.08 1.00 ± 0.02 1.01 ± 0.07 

Metropolized indep. 0.774 ± 0.005 5.0 ± 0.2 7.5 ±0.1 4.7 ± 0.2 0.99 ± 0.01 0.98 ± 0.08 0.96 ± 0.02 0.98 ± 0.07 

1 state move attempted every 5 ps 

neighbor exchange 85.8 ± 2.3 177.7 ± 17.6 330.0 ± 16.1 105.3 ± 12.1 L0 L0 L0 TTO 

independence sampling 39.0 ± 0.9 69.2 ± 6.1 141.1 ± 4.7 49.1 ± 3.8 2.20 ± 0.08 2.6 ± 0.3 2.3 ± 0.1 2.1 ± 0.3 

Metropolized indep. 31.8 ±0.4 51.4 ±1.9 115.7 ±3.4 37.4 ± 1.4 2.70 ± 0.08 3.5 ± 0.4 2.9 ± 0.2 2.8 ± 0.3 

TABLE I. Efficiency measures for expanded ensemble alchemical simulation of united atom methane in water. Times measur- 
ing mixing in state space are: ti, estimated from second eigenvalue of the empirical state transition matrix; r ac , estimated from 
autocorrelation function of the alchemical state index; r on d, estimated average end-to-end transit time for the alchemical state 
index, tn is a structural parameter, the autocorrelation function of the number of TIP3P oxygen molecules within 0.3 nm (87.5% 
of the Lennard-Jones a t j (0.3428 nm)) of the center of the united atom methane particle. The relative speedup in sampling efficiency 
is given relative to the standard neighbor exchange scheme. 



panded ensemble simulation using independence sam- 
pling was run, with weights cjk initialized to zero, then 
adjusted using a Wang-Landau scheme [14], until occu- 
pancy of each state was roughly even to within statis- 
tical noise. With these approximate weights, a 2 ns ex- 
panded ensemble simulation using independence sam- 
pling with fixed weights was run, and the free energy 
of each state was estimated using MBAR [55]. The 
log-weights gk were set to these estimated free ener- 
gies, which were {0.0, 0.32, -0.46,-1.67, -2.83, -3.66}, 
in units of fc^T. Simulations using these weights devi- 
ated by an average of 5% from flat histogram occupancy 
in states, with an average maximum deviation over all 
simulations of less than 10%. 

The state update procedure was carried out either ev- 
ery 0.1 ps (frequent update) or 5 ps (infrequent update), 
in order to test the effect of state updates that were much 
faster than, or on the order of, the conformational cor- 
relation times of molecular dynamics, as water orien- 
tational correlation times are a few picoseconds [93]. 
Production simulations with fixed log-weights were run 
with for 25 ns (250 000 state updates), for frequent up- 
dates, or 100 ns (20 000 state updates), for infrequent 
updates. Three types of state moves were attempted: (1) 
neighbor exchange moves (described in Section III A 3), 
(2) independence sampling (Section III A 2), and (3) 
Metropolized independence sampling (Section III A 3). 
In the case of frequent updates, we additionally per- 
formed 1 000 trials of the state update every 0.1 ps, in- 
stead of a single update, before returning to coordinate 
update moves with molecular dynamics. 

Statistics of the observed replica trajectories are 
shown in Table I. All three mixing efficiency measures 
of the state index trajectories described in Section III C 
were computed: relaxation time of the empirical state 
transition matrix (T2), autocorrelation of the state func- 
tion (T ac ), and average end-to-end distance (r en( j). 



We additionally look at a measure of correlation in the 
coordinate direction. For each configuration, we exam- 
ine the number of O atoms of the water molecules N that 
are found in the interior of the united atom methane, set 
to be 0.3 nm (or 87.5% of the Lennard-Jones try = 0.3428 
nm) from the center. We then compute the autocorre- 
lation function of tm of this variable, which is affected 
both by the dynamics of the state and the dynamical re- 
sponse of the system to changes in state. Uncertainties 
in these time autocorrelation functions are computed by 
subdividing the trajectories into Ns = 10 subtrajecto- 
ries, computing the standard error, and then dividing by 
\/Ns to obtain standard error of the Ns x longer trajec- 
tory. Uncertainties changed by less than 5% when com- 
puted with Ns — 20 for frequent update simulations, 
and less than 10% for infrequent update simulations. 

The relaxation time t 2 estimated from the second 
eigenvalue of the empirical state transition matrix (Sec- 
tion III CI) does appear to provide a lower bound for 
the other estimated mixing times. For the infrequent 
state updates, it is only about 25% smaller than t^- This 
suggests that when transition times in state space are 
of the same order of magnitude as conformational rear- 
rangements T2 is not only a lower bound, but is charac- 
teristic of sampling through the joint state-configuration 
space. We additionally note that mixing time T2 is em- 
pirically exactly proportional to the update frequency; 
the mixing times for the infrequent update state are ex- 
actly (5 ps/0.1 ps) = 50 times longer than the frequent 
state mixing times, a direct consequence of the fact that 
the probability of successful state transitions is directly 
proportional to the rate of attempted transitions. 

For both the frequent and infrequent state updates, in- 
dependence sampling and Metropolized independence 
sampling yield a clear, statistically significant speedup 
by all sampling metrics. This speedup is accentuated for 
infrequent updates. For frequent updates, the speedup 
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is between 1.3 and 2.6 for Metropolized independence 
sampling, while for infrequent updates, it ranges be- 
tween 2.7 and 3.5, as seen in Table I. As expected, 
attempting many state updates in a row (1 000 state 
moves) using any of the state update schemes effectively 
recapitulates the independence sampling scheme. Re- 
peated application of any method that obeys the balance 
condition will eventually converge to the same indepen- 
dent sampling distribution. If state updates are rela- 
tively inexpensive, then any state update scheme that 
ensures the correct distribution is sampled can be iter- 
ated many times, effectively resulting in an indepen- 
dence sampling scheme. Interestingly, this means that 
Metropolized independence sampling becomes worse 
when repeated several times, as it eventually turns into 
simple independence sampling. 

Although the acceleration of independence sampling 
over neighbor exchange is more dramatic with longer 
intervals between state updates, more frequent state up- 
dates appear to always be better than less frequent up- 
dates. For example, neighbor exchange with more fre- 
quent updates achieves shorter correlation times that 
either independence sampling scheme for infrequent 
updates. Increased sampling frequency in state space 
seems to be a good idea. [16, 17] It is possible that there 
are conditions where this conclusion might not be true; 
collective moves like long molecular dynamics trajec- 
tories of polymers might become disrupted by too fre- 
quent changes in state space. Additional study is re- 
quired to understand this phenomena. We finally note 
that for this particular system, Metropolized indepen- 
dence sampling is slightly but clearly better than inde- 
pendence sampling in all sampling measures, providing 
a strong incentive to use Metropolized independence 
sampling where convenient. 



2. Larger Lennard-]ones spheres 

As united atom methane is much smaller than typi- 
cal biomolecules of interest, we additionally examined 
an alchemical expanded ensemble simulation of a much 
larger Lennard-Jones sphere. In this case, the sphere has 
an — 1.09 nm and en = 1.230096 kj/ mol, again solvated 
in a cubic simulation cell containing 893 TIP3P [84] wa- 
ters. These parameters result in a sphere- water er^ = 
0.561 nm, and therefore a particle 5.0 times as large in 
volume as the UA methane sphere. Because of the larger 
volume of the solute, K = 18 alchemically-modified 
thermodynamic states were required, with A = [0, 0.15, 
0.3, 0.45, 0.55, 0.6, 0.64, 0.66, 0.68, 0.70, 0.72, 0.75, 0.78, 
0.81, 0.84, 0.87, 0.90, 1.0]. All other simulation param- 
eters (other than simulation length) were the same as 
the UA methane simulations. Log- weights g k for the 
equilibrium expanded ensemble simulation were deter- 
mined in the same manner as for united atom methane, 
except that a 15 ns simulation was used to generate 
the data for MBAR, yielding weights g k = {0.0, 1.74, 



2.96, 3.39, 2.84, 2.01, 0.73, -0.34, -1.75, -3.35, -4.96, 
-7.19, -9.11, -10.70, -11.98, -12.98, -13.72, -14.65}. 
Frequent state updates were performed every 0.1 ps, 
but infrequent state moves were performed every 1 ps 
rather than 5 ps to obtain better statistics for the larger 
molecule. The production expanded ensemble simula- 
tions were run for a total of 100 ns for frequent exchange, 
and 250 ns for infrequent exchange. The same three 
types of moves in state space were attempted as with 
UA methane. 

Statistics of the observed replica trajectories are 
shown in Table II. All three convergence rate diagnostics 
of the state index trajectories described in Section III C 
were computed. In general, the relaxation time esti- 
mated from the second eigenvalue of the empirical state 
transition matrix (Section III C 1) again provides a lower 
bound for the other computed relaxation times. For 
the infrequent sampling interval T2 is of the same or- 
der of magnitude (2 to 5 times less) than the other sam- 
pling measures. Again, for both the frequent (0.1 ps) 
and infrequent (1 ps) state update intervals, indepen- 
dence sampling and Metropolized independence sam- 
pling yields a clear speedup over neighbor exchange. 
The improvement in sampling efficiency appears to be 
valid for both small and large particles. 



B. Parallel tempering simulations of terminally-blocked 
alanine peptide in implicit solvent 

We next consider a parallel tempering simulation, a 
form of replica exchange in which the thermodynamic 
states differ only in inverse temperature A sys- 
tem containing terminally-blocked alanine (sequence 
Ace-Ala-Nme) was constructed using the LEaP pro- 
gram [94] from the AmberTools 1.2 package with bug- 
fixes 1-4 applied. The Amber parm96 forcefield was 
used [95] along with the Onufriev-Bashford-Case gen- 
eralized Born-surface area (OBC GBSA) implicit solvent 
model (corresponding to model I of [96], equivalent 
to igb=2 in Amber's sander program and using the 
mbondi2 radii selected within LEaP). 

A custom Python code making use of the GPU- 
accelerated OpenMM package [97-99] and the Py- 
OpenMM Python wrapper [100] was used to conduct 
the simulations. All forcefield terms are identical to 
those used in AMBER except for the surface area term, 
which was left as default in the OpenMM implementa- 
tion through a GBSAOBCForce term. Parallel temper- 
ing simulations of 2 000 iterations were run, with dy- 
namics propagated by 500 steps each iteration using a 2 
fs timestep and the leapfrog Verlet integrator [101, 102]. 
Velocities were reassigned from the Maxwell-Boltzmann 
distribution each iteration. The Python scripts for sim- 
ulation and data analysis used here are available online 
at http : / /simtk . org/home/ gibbs. 

For the replica-mixing phase, the simulation em- 
ployed either neighbor exchange (Section III B 1) or 
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mixing times (ps) relative speedup 

T2 T ac Tend TN T2 T ac T cnd TN 

1 state move attempted every 0.1 ps 

neighbor exchange 9.51 ± 0.01 65.8 ± 4.2 126.3 ± 4.2 58.1 ± 4.3 L0 L0 L0 II) 

independence sampling 2.586 ± 0.009 42.9 ± 2.4 88.4 ± 2.7 41.5 ± 2.0 3.68 ± 0.01 1.5 ± 0.1 1.43 ± 0.06 1.4 ± 0.1 

Metropolized indep. 2.181 ± 0.006 48.6 ± 4.0 88.3 ± 3.0 46.7 ±3.4 4.36 ± 0.01 1.4 ± 0.1 1.43 ± 0.07 1.2 ± 0.1 
1 state move attempted every 1 ps 

neighbor exchange 95.0 ± 0.2 211.1 ± 58.9 507.6 ± 19.3 167.6 ± 16.0 LCT~ 1.0 1.0 1.0 

independence sampling 25.8 ± 0.1 67.3 ± 3.6 196.0 ± 5.8 63.1 ±3.3 3.69 ± 0.02 3.1 ± 0.9 2.6 ± 0.1 2.7 ± 0.3 

Metropolized indep. 21.6 ± 0.1 66.8 ± 2.4 169.2 ± 4.7 62.1 ± 2.5 4.40 ± 0.02 3.2 ± 0.9 3.0 ± 0.1 2.7 ± 0.3 

TABLE II. Efficiency measures for expanded ensemble alchemical simulation of large LJ sphere in water. Times measuring 
mixing in state space are: T2, estimated from second eigenvalue of empirical state transition matrix; T ac/ estimated from autocor- 
relation time of alchemical state index; T cn d, estimated average end-to-end transit time for alchemical state index, tn is a structural 
parameter, the autocorrelation function of the number of TIP3P oxygen molecules within 0.5 nm (85.3% of the Lennard-Jones aij 
(0.5860 nm)) of the center of the large Lennard-Jones particle. The relative speedup in sampling efficiency is given relative to the 
standard neighbor exchange scheme. 



state i 

T2 


nixing times 

Tac 


>(ps) 

''"end 


struct 

Tcos 


aral corre 

Tsin <fi 


ation time; 

Tcos ip 


'(ps) 

''"sin ip 


neighbor exchange 91.8 ± 0.6 
independence sampling 2.62 ± 0.01 


80 ±2 
1.60 ± 0.06 


360 ± 30 
28.7 ± 0.7 


25 ±2 
12.4 ± 0.5 


110 ±9 
8.7 ±0.4 


25 ±2 
11.8 ±0.6 


66 ±6 
9.1 ± 0.5 



TABLE III. Efficiency measures for parallel tempering simulation of alanine dipeptide in implicit solvent. Mixing times listed 
are: T2, estimated from second eigenvalue of empirical state transition matrix; T ac , estimated from autocorrelation time of alchem- 
ical state index; T cn d, estimated average end-to-end transit time for alchemical state index. Autocorrelation times of trigonometric 
functions of <j> and ip torsion angles are listed as t cos $ , T s i n $ , T coa ^ , T s i n ^ . The statistical error is given as one standard error of the 
mean. 



independence sampling (Section IIIB2), with K 3 at- 
tempted swaps of replica pairs selected at random. The 
efficiency was measured in several ways, shown in Ta- 
ble III. In addition to the standard mixing metrics de- 
scribed in Section III C, an estimate of the configura- 
tional relaxation times was also made; due to the circu- 
lar nature of the torsional coordinates <p an d ip known to 
be slow degrees of freedom for this system [103], we in- 
stead computed the autocorrelation times for sin </>, cos (j>, 
s'mip, and cos0. All replicas were treated as equiva- 
lent, and their raw statistics (e.g. autocorrelation func- 
tions before normalization) were averaged to produce 
these estimates. Statistical error was again estimated by 
blocking. 



As expected, the various metrics indicate that the par- 
allel tempering replicas mix in state space much more 
rapidly with independence sampling than when only 
neighbor exchanges are attempted. The amount by 
which mixing is accelerated depends on the metric used 
to quantify this, but it is roughly one to two orders of 
magnitude. The structural relaxation times also reflect a 
speedup, though much more modest than the accelera- 
tion in state space sampling — roughly a factor of two to 
ten, depending on the metric examined. 



C. Two-dimensional replica exchange umbrella sampling 
of terminally-blocked alanine peptide in implicit solvent 

Finally, we consider a two-dimensional replica ex- 
change umbrella sampling situation, commonly used to 
compute potentials of mean force along two coordinates 
of interest. We again consider the alanine dipeptide in 
implicit solvent, and employ umbrella potentials to re- 
strain the 4> and rp torsions near reference values (</>° , ip®) 
for K = 101 replicas spaced evenly on a 10 x 10 toroidal 
grid, with the inclusion of one replica without any bias 
potential for ease of post-simulation analysis. 

Because harmonic constraints are not periodic, we 
employ periodic bias potential based on the von Mises 
circular normal distribution, 

U' k {x) = k [cos(<£ - <P° k ) + cos(^ - (31) 

where k has units of energy. For sufficiently large values 
of k, this will localize the torsion angles in an approxi- 
mately Gaussian distribution near the reference torsions 
(<^fc> V'fc) with a standard deviation of a = (/3k) 1 / 2 . 

Here, we employ a k of (2-k/30)~ 2 /3~ 1 so that neigh- 
boring bias potentials are separated by 3(7. This was 
sufficient to localize sampling near the reference torsion 
values for most sterically unhindered regions. The sim- 
ulation was run at 300 K, using a 2 fs timestep with 5 ps 
between replica exchange attempts. A total of 2 000 iter- 
ations were conducted, with each iteration consisting of 
mixing the replica state assignments via a state update 
phase, a new velocity assignment from the Maxwell- 
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Boltzmann distribution, propagation of dynamics, and 
writing out the resulting configuration data. The first 
100 iterations were discarded as equilibration. 

The same mixing schemes examined in the parallel 
tempering simulation were evaluated here, and the re- 
sults of the efficiency metrics are summarized in Ta- 
ble IV. Note that the end-to-end time does not have a 
clear interpretation in terms of the average transit time 
between a maximum and minimum thermodynamic pa- 
rameter here — it simply reflects the average time be- 
tween exchanges between a particular localized um- 
brella and the unbiased state. 

As in the parallel tempering case, we find that both 
mixing times in state space and the structural correla- 
tion times are reduced by use of Gibbs sampling, albeit 
to a lesser degree than in the parallel tempering case. 
Here, state relaxation times are reduced by a factor of 
two to six, depending on the metric considered, while 
structural correlation times are reduced by a factor of 
four or five. 



VI. DISCUSSION 

We have presented the framework of Gibbs sampling 
on the joint set of state and coordinate variables to bet- 
ter understand different expanded ensemble and replica 
exchange schemes, and demonstrated how this frame- 
work can identify simple ways to enhance the efficiency 
of expanded ensemble and replica exchange simulations 
by modifying the thermodynamic state update phase of 
the algorithms. While the actual efficiency improvement 
will depend on the system and simulation details, we 
believe there is likely little, if any, drawback to using 
these improvements in a broad range of situations. 

For simulated and parallel tempering simulations, in 
which only the temperature is varied among the ther- 
modynamic states, the recommended scheme (indepen- 
dence sampling updates, Sections III A 2 and III B 2) is 
simple and inexpensive enough to be easily adopted by 
simulated and parallel tempering codes. Because calcu- 
lation of exchange probability requires no additional en- 
ergy evaluations, it is effectively free. Other expanded 
ensemble or replica exchange simulations where the 
potential does not vary between states (such as ex- 
change among temperatures and pressures [48] or pH 
values [104]) are also effectively free, as no additional 
energy evaluations are required in these cases either. As 
long as state space evaluations are cheap compared to 
configuration updates, independence sampling will mix 
more rapidly than neighbor updates, though this ad- 
vantage will be reduced as the interval spent between 
configuration updates by molecular dynamics or Monte 
Carlo simulation or the total time performing these co- 
ordinate updates becomes very small. 

In some cases, exchange of information between pro- 
cessors during replica exchange in tightly coupled par- 
allel codes may incur some cost, mainly in the form of 



latency. In many cases, however, the decrease in mix- 
ing times could more than offset any loss in parallel ef- 
ficiency. If the recommended independence sampling 
schemes would consume a substantial fraction of the it- 
eration time, or where the parallel implementation of 
state updates is already complex, it may still be rel- 
atively inexpensive to simply perform the same state 
update scheme several times, achieving enhanced mix- 
ing with little extra coding or computational overhead. 
Alternatively, the Gibbs sampling formalism could be 
used to design some other scheme that performs fre- 
quent state space sampling only on replicas that are local 
in the topology of the code. 

For simulated scaling [14] or Hamiltonian exchange 
simulations [7-10], independence sampling updates of 
state permutation vector S requires evaluation of the 
reduced potential Uk{x) at all K states for the current 
configuration (in simulated scaling) or all replica con- 
figurations Xk (for Hamiltonian exchange), which re- 
quires more energy evaluations than the neighbor ex- 
change scheme. However, if the intent is to make use of 
the multistate Bennett acceptance ratio (MBAR) estima- 
tor [55], which produces optimal estimates of free en- 
ergy differences and expectations, all of these energies 
are required for analysis anyway, and so the computa- 
tional impact on simulation time is negligible. It is more 
computationally efficient to evaluate these additional re- 
duced potentials during the simulation, instead of post- 
processing simulation data, which is especially true if 
the additional reduced potential evaluations are done in 
parallel. Alternatively, if a simulated scaling simulation 
is run and one does not wish to use MBAR, restricted 
range state updates (Section III A 4) offer improved mix- 
ing behavior with minimal additional number of energy 
evaluations. 

We have found that examining the exchange statis- 
tics, the empirical state transition matrix and its dom- 
inant eigenvalues, is extremely useful in diagnosing 
equilibration and convergence, as well as poor choices 
of thermodynamic states. It is often very easy to see, 
from the diagonally dominant structure of this matrix, 
where regions of poor state overlap occur. Poor over- 
lap among sets of thermodynamic states observed early 
in simulations from the empirical state transition ma- 
trix are likely to also frustrate post-simulation analysis 
with techniques like MBAR and histogram reweighting 
methods [55, 81, 105], making such metrics useful diag- 
nostic tools. 

For more complex state topologies in expanded en- 
semble or replica exchange simulations, where for ex- 
ample several different pressures or temperatures are in- 
cluded simultaneously, there may not exist a simple grid 
of values, or it may not be easy to identify which states 
are the most efficient neighbors. Using independence 
sampling eliminates the need to plan efficient exchange 
schemes among neighbors, or even to determine which 
states are neighbors. This may encourage the addition 
of states that aid in reducing the correlation time of the 
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state i 


nixing times 


(ps) 

^end 


struc 

7"cos <jl 


tural corr 

7"sin 4> 


elation times 

7~cos ip 


, (ps) 


neighbor exchange 82 ± 4 
independence sampling 24.2 ± 0.3 


31.0 ±0.9 
5.45 ± 0.06 


350 ± 30 
175 ±6 


47 ±2 
8.92 ± 0.09 


57 ±2 
9.9 ± 0.1 


26.4 ± 0.8 
5.63 ± 0.04 


27.1 ± 0.9 
6.09 ± 0.04 



TABLE IV. Efficiency measures for two-dimensional replica exchange umbrella sampling for the alanine dipeptide in implicit 
solvent, mixing times ins state space listed are: ti, estimated from second eigenvalue of empirical state transition matrix; r aC/ 
estimated from autocorrelation time of alchemical state index; r cn d, estimated average end-to-end transit time for alchemical 
state index. Autocorrelation times of trigonometric functions of 4> and tjj torsion angles are listed as t cos ^, r s i n ^,, r cos1 /,, r s i n ^,. The 
statistical error is given as one standard error of the mean. 



overall Markov chain solely by speeding decorrelation 
of conformational degrees of freedom, since they will 
automatically couple to states with reasonable phase 
space overlap. 

It is important to stress, however, that expanded en- 
semble and replica exchange simulations are not a cure- 
all for all systems with poor sampling. In the presence 
of a first-order or pseudo-first-order phase transition, 
phase space mixing may still take an exponentially long 
time even when simulated or parallel tempering algo- 
rithms are used [42] . Optimization of the state exchange 
scheme, as described here, can only help so much; fur- 
ther efficiency gains would require design of interme- 
diate states that abolish the first-order phase transition. 
Schemes for optimal state selection are an area of active 
research [26-32]. 

Finally, we observe that the independence sampling 
scheme for a simulated tempering simulation or any 
simulation where the contribution to the reduced po- 
tential is a thermodynamic parameter A multiplying a 
conjugate configuration-dependent variable h(x) natu- 
rally generalizes to a continuous limit. As the number K 
of thermodynamic states A& is increased between some 
fixed lower and upper limits, this process eventually re- 
sults in the thermodynamic state index k effectively be- 
coming a continuous variable A [2]. Such a continuous 
tempering simulation would sample from the joint dis- 
tribution n(x, A) oc exp[— Xh(x) + g(A)], with the contin- 
uous log weighting function g(A) replacing the discrete 
<7fe in simulated tempering simulations. 

The Gibbs sampler and variations on it remain excit- 
ing areas for future exploration, and we hope that our 



conditional state space sampling formulation will make 
it much easier for other researchers to envision, develop, 
and implement new schemes for sampling from mul- 
tiple thermodynamics states. We also hope it encour- 
ages exploration of further connections between the two 
deeply interrelated fields of statistical mechanics and 
statistical inference. 
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